Fix support for API concurrency > 1 by updating Uvicorn launch logic #408

Open
DifferentialityDevelopment wants to merge 3 commits into runpod:main from DifferentialityDevelopment:patch-1

Conversation

@DifferentialityDevelopment

This fixes the bug that occurs when launching the serverless worker locally with API concurrency > 1.

I got the following error:
WARNING: You must pass the application as an import string to enable 'reload' or 'workers'.

See the issue I opened for details:
#407

Fix for the bug
WARNING:  You must pass the application as an import string to enable 'reload' or 'workers'.
@DifferentialityDevelopment
Author

DifferentialityDevelopment commented Apr 10, 2025

I noticed that running with concurrency is not supported on Windows at all.
I was able to get it "working" on Linux with the above changes, but only if I give it test input data; otherwise the workers exit immediately. With concurrency = 1 the worker persists.

I was going through all this to stress test my application and gather data on per-request memory usage, peak memory usage at different concurrency levels, and so on.
Does concurrency_modifier even have any effect in a development environment when testing with concurrency = 1?
My worker handler itself is async.
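
For reference, the kind of measurement described above could be sketched roughly as follows. This is only an illustration: handler, run_batch, and peak_bytes are hypothetical names, the handler is a stand-in that allocates a buffer per request, and tracemalloc only sees Python-level allocations, not total process memory.

```python
import asyncio
import tracemalloc


async def handler(job):
    """Hypothetical async handler that allocates some memory per request."""
    payload = bytearray(job["size"])
    await asyncio.sleep(0)  # yield control so concurrent requests overlap
    return len(payload)


async def run_batch(concurrency, size=100_000):
    """Run `concurrency` handler calls at the same time."""
    jobs = [{"size": size} for _ in range(concurrency)]
    return await asyncio.gather(*(handler(job) for job in jobs))


def peak_bytes(concurrency):
    """Peak Python-level allocation while `concurrency` handlers run together."""
    tracemalloc.start()
    try:
        asyncio.run(run_batch(concurrency))
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


# Compare peak usage across a few concurrency levels.
peaks = {level: peak_bytes(level) for level in (1, 4, 8)}
```

This runs everything inside one process and one event loop, so it says nothing about multi-process Uvicorn workers; it only shows how per-request allocations add up when async requests overlap.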

Member

@deanq deanq left a comment


Unit tests are broken.

@sairambokka

Unit tests are broken because gunicorn was added as part of the api_concurrency > 1 change. This should be a simple package install.

@sairambokka

sairambokka commented Jul 1, 2025

[Image: Screenshot 2025-07-01 at 2 17 50 PM]

uvicorn.workers will probably be removed in a future release, so it is better to switch to gunicorn as mentioned here: https://www.uvicorn.org/deployment/#gunicorn


Copilot AI left a comment


Pull request overview

This PR aims to fix local serverless worker startup when --rp_api_concurrency > 1 by changing how Uvicorn is launched for multi-worker mode (to avoid Uvicorn’s requirement to use an import string when workers is set).

Changes:

  • Adds branching logic in WorkerAPI.start_uvicorn() to use a different uvicorn.run(...) invocation for api_concurrency > 1 vs == 1.
  • Switches the multi-worker path to an import-string target and enables factory=True.
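
As an illustration of what an import-string target with factory=True expects, here is a minimal sketch of a module-level, zero-argument factory that returns a plain ASGI app. The names create_app, call_once, and my_module are hypothetical and not taken from this repository; the app is driven directly with asyncio so the sketch runs without a server installed.

```python
import asyncio


def create_app():
    """Hypothetical zero-argument factory returning a minimal ASGI app."""

    async def app(scope, receive, send):
        # Only plain HTTP requests are handled in this sketch.
        assert scope["type"] == "http"
        await send({
            "type": "http.response.start",
            "status": 200,
            "headers": [(b"content-type", b"text/plain")],
        })
        await send({"type": "http.response.body", "body": b"ok"})

    return app


async def call_once(app):
    """Drive the ASGI app once without a server and collect what it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": "/"}, receive, send)
    return sent


messages = asyncio.run(call_once(create_app()))
# With Uvicorn installed and this file saved as a hypothetical my_module.py,
# a multi-worker launch would be along the lines of:
#   uvicorn.run("my_module:create_app", factory=True, workers=2)
```

Because each worker process imports the module and calls the factory itself, the factory must be importable at module level and must not depend on state that exists only in the parent process.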


Comment on lines +274 to 297
if api_concurrency > 1:
    # For multiple workers, we need to use the module:app format
    import uvicorn.workers
    uvicorn.run(
        "runpod.serverless.modules.rp_fastapi:app",
        host=api_host,
        port=int(api_port),
        workers=int(api_concurrency),
        log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
        access_log=False,
        factory=True
    )
else:
    # For single worker, we can use the app instance directly
    import uvicorn.workers
    uvicorn.run(
        self.rp_app,
        host=api_host,
        port=int(api_port),
        workers=1,
        log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
        access_log=False
    )


Copilot AI Apr 9, 2026


In the multi-worker branch, the import string points to runpod.serverless.modules.rp_fastapi:app, but this module does not define a module-level app symbol. Additionally, factory=True tells Uvicorn to treat that import as an app factory callable; even if app existed as a FastAPI instance, this would be invoked like a zero-arg function and fail. Define a proper module-level factory (and reference it), or expose a module-level ASGI app and remove factory=True so workers>1 can start reliably.

Suggested change
- if api_concurrency > 1:
-     # For multiple workers, we need to use the module:app format
-     import uvicorn.workers
-     uvicorn.run(
-         "runpod.serverless.modules.rp_fastapi:app",
-         host=api_host,
-         port=int(api_port),
-         workers=int(api_concurrency),
-         log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
-         access_log=False,
-         factory=True
-     )
- else:
-     # For single worker, we can use the app instance directly
-     import uvicorn.workers
-     uvicorn.run(
-         self.rp_app,
-         host=api_host,
-         port=int(api_port),
-         workers=1,
-         log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
-         access_log=False
-     )
+ import uvicorn.workers
+ # `self.rp_app` is the ASGI app instance created for this WorkerAPI.
+ # Running Uvicorn with an import string and `factory=True` requires a
+ # module-level callable that is not defined in this module, so always
+ # launch the existing app instance directly.
+ uvicorn.run(
+     self.rp_app,
+     host=api_host,
+     port=int(api_port),
+     workers=1,
+     log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
+     access_log=False
+ )
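
To make the two failure modes in this comment concrete, the following sketch resolves a "module:attribute" string and optionally calls the result as a zero-argument factory. It is a simplified stand-in, not Uvicorn's actual implementation; load_app and demo_app_module are hypothetical names introduced only for this illustration.

```python
import importlib
import sys
import types


def load_app(import_str, factory=False):
    """Resolve "module:attribute" and optionally call it as a zero-arg factory.

    Simplified stand-in for what a server does with an import string.
    """
    module_name, _, attr = import_str.partition(":")
    module = importlib.import_module(module_name)
    target = getattr(module, attr)  # AttributeError if the symbol is missing
    return target() if factory else target


# Hypothetical module standing in for a real application module.
demo = types.ModuleType("demo_app_module")
demo.create_app = lambda: "app-instance"
sys.modules["demo_app_module"] = demo

# A module-level factory resolves and is called to build the app.
built = load_app("demo_app_module:create_app", factory=True)

# A symbol the module does not define fails at lookup time.
try:
    load_app("demo_app_module:app", factory=True)
    missing_symbol_error = None
except AttributeError as exc:
    missing_symbol_error = exc
```

The same lookup happens in every worker process, which is why an app instance that exists only as an attribute of a WorkerAPI object cannot be reached through an import string.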

Comment on lines +275 to +290
    # For multiple workers, we need to use the module:app format
    import uvicorn.workers
    uvicorn.run(
        "runpod.serverless.modules.rp_fastapi:app",
        host=api_host,
        port=int(api_port),
        workers=int(api_concurrency),
        log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
        access_log=False,
        factory=True
    )
else:
    # For single worker, we can use the app instance directly
    import uvicorn.workers
    uvicorn.run(
        self.rp_app,

Copilot AI Apr 9, 2026


import uvicorn.workers is executed in both branches but the module isn’t used. Importing uvicorn.workers typically requires the optional gunicorn dependency; since this repo doesn’t declare gunicorn, this can raise ModuleNotFoundError and break even the single-worker path. Remove this import (or guard it behind an explicit optional dependency check if you truly need Gunicorn).
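
If Gunicorn support is really needed, one possible shape for such a guard is sketched below. The helper names gunicorn_available and require_gunicorn are hypothetical; the check relies only on the standard library's importlib.util.find_spec, so nothing is imported unless the package is present.

```python
import importlib.util


def gunicorn_available():
    """Return True if the optional gunicorn package can be imported."""
    return importlib.util.find_spec("gunicorn") is not None


def require_gunicorn():
    """Raise a clear error instead of a bare ModuleNotFoundError."""
    if not gunicorn_available():
        raise RuntimeError(
            "gunicorn is not installed; install it to use Gunicorn-managed workers."
        )
```

Calling require_gunicorn() only on the code path that actually needs Gunicorn keeps the single-worker path free of the optional dependency.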

Comment on lines 270 to +296
  def start_uvicorn(self, api_host="localhost", api_port=8000, api_concurrency=1):
      """
      Starts the Uvicorn server.
      """
-     uvicorn.run(
-         self.rp_app,
-         host=api_host,
-         port=int(api_port),
-         workers=int(api_concurrency),
-         log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
-         access_log=False,
-     )
+     if api_concurrency > 1:
+         # For multiple workers, we need to use the module:app format
+         import uvicorn.workers
+         uvicorn.run(
+             "runpod.serverless.modules.rp_fastapi:app",
+             host=api_host,
+             port=int(api_port),
+             workers=int(api_concurrency),
+             log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
+             access_log=False,
+             factory=True
+         )
+     else:
+         # For single worker, we can use the app instance directly
+         import uvicorn.workers
+         uvicorn.run(
+             self.rp_app,
+             host=api_host,
+             port=int(api_port),
+             workers=1,
+             log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
+             access_log=False
+         )

Copilot AI Apr 9, 2026


start_uvicorn now has branching behavior for api_concurrency > 1, but there are no unit tests asserting the parameters passed to uvicorn.run in either branch (especially the multi-worker import-string path). Adding targeted tests that patch runpod.serverless.modules.rp_fastapi.uvicorn.run and exercise both concurrency modes will help prevent regressions like missing symbols / wrong factory settings.
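
A rough sketch of the kind of test meant here, written so it runs on its own: the uvicorn object and the WorkerAPI class below are trimmed, hypothetical stand-ins rather than the real module and class. In the repository, the patch target would be runpod.serverless.modules.rp_fastapi.uvicorn.run as named above.

```python
import types
from unittest import mock

# Stand-in for the real `uvicorn` module so the sketch runs without it.
uvicorn = types.SimpleNamespace(run=lambda *args, **kwargs: None)


class WorkerAPI:
    """Hypothetical, trimmed stand-in for the class under test."""

    rp_app = object()

    def start_uvicorn(self, api_host="localhost", api_port=8000, api_concurrency=1):
        uvicorn.run(
            self.rp_app,
            host=api_host,
            port=int(api_port),
            workers=int(api_concurrency),
        )


def test_start_uvicorn_passes_expected_arguments():
    # Replace `run` with a mock so no server is started.
    with mock.patch.object(uvicorn, "run") as mock_run:
        WorkerAPI().start_uvicorn(api_port="8080", api_concurrency="2")
    # String inputs should reach uvicorn.run as integers.
    mock_run.assert_called_once_with(
        WorkerAPI.rp_app, host="localhost", port=8080, workers=2
    )


test_start_uvicorn_passes_expected_arguments()
```

A second test with api_concurrency=1 would follow the same pattern and pin down the single-worker arguments.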

Comment on lines +274 to 297
if api_concurrency > 1:
    # For multiple workers, we need to use the module:app format
    import uvicorn.workers
    uvicorn.run(
        "runpod.serverless.modules.rp_fastapi:app",
        host=api_host,
        port=int(api_port),
        workers=int(api_concurrency),
        log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
        access_log=False,
        factory=True
    )
else:
    # For single worker, we can use the app instance directly
    import uvicorn.workers
    uvicorn.run(
        self.rp_app,
        host=api_host,
        port=int(api_port),
        workers=1,
        log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
        access_log=False
    )


Copilot AI Apr 9, 2026


Enabling workers > 1 means requests will be served by multiple processes. JobsProgress persists to disk, but methods used by the API (job_list.get(...) in _sim_stream/_sim_status and others) don’t reload state before reading, so a worker may not see jobs added by a different worker and return “Job ID not found”. If multi-worker mode is supported, consider reloading state on reads (e.g., in JobsProgress.get() / __iter__) or otherwise ensuring cross-process consistency for job tracking.

Suggested change
- if api_concurrency > 1:
-     # For multiple workers, we need to use the module:app format
-     import uvicorn.workers
-     uvicorn.run(
-         "runpod.serverless.modules.rp_fastapi:app",
-         host=api_host,
-         port=int(api_port),
-         workers=int(api_concurrency),
-         log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
-         access_log=False,
-         factory=True
-     )
- else:
-     # For single worker, we can use the app instance directly
-     import uvicorn.workers
-     uvicorn.run(
-         self.rp_app,
-         host=api_host,
-         port=int(api_port),
-         workers=1,
-         log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
-         access_log=False
-     )
+ if int(api_concurrency) > 1:
+     raise ValueError(
+         "api_concurrency > 1 is not supported because job tracking state is "
+         "not synchronized across multiple Uvicorn worker processes."
+     )
+ # For a single worker, we can use the app instance directly.
+ import uvicorn.workers
+ uvicorn.run(
+     self.rp_app,
+     host=api_host,
+     port=int(api_port),
+     workers=1,
+     log_level=os.environ.get("UVICORN_LOG_LEVEL", "info"),
+     access_log=False
+ )
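
For the alternative mentioned in this comment (reloading state on reads), a minimal sketch of the idea might look like the following. FileBackedJobs is a hypothetical class, not the repository's JobsProgress; it assumes a JSON file as the shared store and does no locking, so concurrent writers could still lose updates.

```python
import json
import os
import tempfile


class FileBackedJobs:
    """Hypothetical job tracker that re-reads its file on every lookup."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        # A missing file means no jobs have been recorded yet.
        if not os.path.exists(self.path):
            return set()
        with open(self.path, "r", encoding="utf-8") as handle:
            return set(json.load(handle))

    def add(self, job_id):
        jobs = self._load()
        jobs.add(job_id)
        # Write to a temporary file, then swap it in so readers never see a partial file.
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as handle:
            json.dump(sorted(jobs), handle)
        os.replace(tmp_path, self.path)

    def get(self, job_id):
        # Reload first, so jobs added by another process are visible.
        return job_id if job_id in self._load() else None


state_file = os.path.join(tempfile.mkdtemp(), "jobs.json")
worker_a = FileBackedJobs(state_file)
worker_b = FileBackedJobs(state_file)  # stands in for a second worker process
worker_a.add("job-1")
found = worker_b.get("job-1")
```

Here worker_b sees the job added through worker_a because every get() goes back to disk; an in-memory cache that is only loaded once would miss it.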


4 participants